Estimating Mixture Entropy with Pairwise Distances
نویسندگان
چکیده
Mixture distributions arise in many parametric and non-parametric settings—for example, in Gaussian mixture models and in non-parametric estimation. It is often necessary to compute the entropy of a mixture, but, in most cases, this quantity has no closed-form expression, making some form of approximation necessary. We propose a family of estimators based on a pairwise distance function between mixture components, and show that this estimator class has many attractive properties. For many distributions of interest, the proposed estimators are efficient to compute, differentiable in the mixture parameters, and become exact when the mixture components are clustered. We prove this family includes lower and upper bounds on the mixture entropy. The Chernoff α-divergence gives a lower bound when chosen as the distance function, with the Bhattacharyaa distance providing the tightest lower bound for components that are symmetric and members of a location family. The Kullback–Leibler divergence gives an upper bound when used as the distance function. We provide closed-form expressions of these bounds for mixtures of Gaussians, and discuss their applications to the estimation of mutual information. We then demonstrate that our bounds are significantly tighter than well-known existing bounds using numeric simulations. This estimator class is very useful in optimization problems involving maximization/minimization of entropy and mutual information, such as MaxEnt and rate distortion problems.
منابع مشابه
Correction: Kolchinsky, A. and Tracey, B.D. Estimating Mixture Entropy with Pairwise Distances. Entropy 2017, 19, 361
متن کامل
Mixture of Gaussians for distance estimation with missing data
Many data sets have missing values in practical application contexts, but the majority of commonly studied machine learning methods cannot be applied directly when there are incomplete samples. However, most such methods only depend on the relative differences between samples instead of their particular values, and thus one useful approach is to directly estimate the pairwise distances between ...
متن کاملDetermination of weight vector by using a pairwise comparison matrix based on DEA and Shannon entropy
The relation between the analytic hierarchy process (AHP) and data envelopment analysis (DEA) is a topic of interest to researchers in this branch of applied mathematics. In this paper, we propose a linear programming model that generates a weight (priority) vector from a pairwise comparison matrix. In this method, which is referred to as the E-DEAHP method, we consider each row of the pairwise...
متن کاملInformation geometry on hierarchy of probability distributions
An exponential family or mixture family of probability distributions has a natural hierarchical structure. This paper gives an “orthogonal” decomposition of such a system based on information geometry. A typical example is the decomposition of stochastic dependency among a number of random variables. In general, they have a complex structure of dependencies. Pairwise dependency is easily repres...
متن کاملA Semi-Definite Programming approach to low dimensional embedding for unsupervised clustering
This paper proposes a variant of the method of Guédon and Verhynin for estimating the cluster matrix in the Mixture of Gaussians framework via Semi-Definite Programming. A clustering oriented embedding is deduced from this estimate. The procedure is suitable for very high dimensional data because it is based on pairwise distances only. Theoretical garantees are provided and an eigenvalue optimi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Entropy
دوره 19 شماره
صفحات -
تاریخ انتشار 2017